IMAGINE MOBILE ROBOTS OF THE FUTURE, working side by side with humans, collaborating
in a shared workspace. For this to become a reality, robots must be able to do something
that humans do constantly: understand how others perceive space and the relative positions
of objects around them. In short, they need the ability to see things from another person's
point of view. Our research group and others are building computational, cognitive, and
linguistic models that can deal with frames of reference. Issues include dealing with
constantly changing frames of reference and changes in spatial perspective, understanding
what actions to take, the use of new words, and common ground.

Our approach is an implementation informed by cognitive and computational theories.
It is based on developing computational cognitive models (CCMs) of certain high-level
cognitive skills that humans possess and that are relevant for collaborative tasks. We then
use these models as reasoning mechanisms for our robots. Why do we propose using CCMs as
opposed to more traditional programming paradigms for robots? We believe that by giving
the robots representations and reasoning mechanisms similar to those used by humans, we
will build robots that act in a way that is more compatible with humans.

Hide and Seek. Our foray into this area started when we were developing computational
cognitive models of how young children learn the game of hide and seek [1]. The purpose
was to enable our robots to use human-level cognitive skills to decide where to look for
people or for things hidden by people. The research resulted in a hybrid architecture with
a reactive/probabilistic system for robot mobility [5] and a high-level cognitive system
based on ACT-R [6] that decided where to hide or seek (depending on which role the robot
was playing). Videos of the robot playing a game of hide and seek can be seen at
www.nrl.navy.mil/aic/iss/aas.

While this work was interesting in its own right, the system led us to the realization
that the ability to do perspective-taking is a critical cognitive ability for humans,
particularly when they want to collaborate.

Spatial Perspective in Space. To determine just how important perspective and
frames of reference are in collaborative tasks in shared space (and also because we were
working on a DARPA-funded project to move these capabilities to the NASA Robonaut), we
analyzed a series of tapes of two astronauts and a ground controller training in the NASA
Neutral Buoyancy Tank facility for an assembly task for Space Station mission 9A. We
performed a protocol analysis of several hours of these tapes, focusing on the use of spatial
language and commands from one person to another. We found that the astronauts changed
their frame of reference (as seen in their dialog) approximately every other utterance.
As an example of how prevalent these changes in frame of reference are, consider the
following utterance from ground control:

"...  if  you  come  straight  down  from  where  you  are,  uh,  and  uh,  kind  of  peek 
down  under  the  rail  on  the  nadir  side,  by your  right  hand,  almost  straight  nadir, 
you  should  see  the..." 

Here we see five changes in frame of reference (highlighted in italics) in a single
sentence! These rates of change in frame of reference are consistent with work by Franklin,
Tversky, and Coon (1992) [4]. In addition, we found that the astronauts had to take other
perspectives, or forced others to take their perspective, about 25 percent of the time [3].
The ability to handle changing frames of reference and to understand spatial perspective
will therefore be a critical skill for robots such as NASA Robonaut and, we would argue,
for any other robotic system that needs to communicate with people in spatial contexts
(e.g., construction tasks or direction giving).
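
To make the idea of switching frames of reference concrete, the sketch below re-expresses a
location given in a speaker's egocentric frame ("to my right") in a listener's egocentric
frame. It is only a minimal 2D illustration under assumed conventions, not the models
described in this article: poses are hypothetical (x, y, heading) triples with the heading in
radians, offsets are (forward, left) pairs, and the function names are invented for this
example.

```python
import math

def ego_to_world(offset, pose):
    """Map a (forward, left) offset in an agent's own frame to world (x, y)."""
    x, y, heading = pose
    fwd, left = offset
    c, s = math.cos(heading), math.sin(heading)
    return (x + fwd * c - left * s, y + fwd * s + left * c)

def world_to_ego(point, pose):
    """Map a world (x, y) point to a (forward, left) offset in an agent's frame."""
    x, y, heading = pose
    dx, dy = point[0] - x, point[1] - y
    c, s = math.cos(heading), math.sin(heading)
    # Inverse rotation: project the displacement onto the agent's axes.
    return (dx * c + dy * s, -dx * s + dy * c)

def describe(offset):
    """Coarse verbal label for a (forward, left) offset (assumed four-way split)."""
    fwd, left = offset
    if abs(fwd) >= abs(left):
        return "ahead" if fwd > 0 else "behind"
    return "left" if left > 0 else "right"

# Hypothetical scene: the speaker faces +y, the robot faces -x.
speaker = (0.0, 0.0, math.pi / 2)
robot = (2.0, 0.0, math.pi)

# "One meter to my right" in the speaker's frame is (forward=0, left=-1).
target_world = ego_to_world((0.0, -1.0), speaker)
target_for_robot = world_to_ego(target_world, robot)
print(describe((0.0, -1.0)), "of the speaker is", describe(target_for_robot), "of the robot")
```

In this assumed scene, the spot the speaker calls "right" lies directly ahead of the robot,
which is the kind of reinterpretation a listener has to perform each time the frame of
reference changes.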

Models of Perspective Taking. Imagine the following task. An astronaut and his
robotic assistant are working together to assemble a structure in shared space. The human
says to the robot, "Pass me the wrench." From the robot's point of view, two wrenches are
visible, but the human has a partially occluded view and can see only one of them. What
should the robot do? Evidence suggests that humans, in similar situations, will pass the
wrench that they know the other human can see, because it is jointly salient [7].
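
The wrench scenario can be sketched as a simple visibility check. The code below is a
minimal geometric illustration, not the ACT-R/S or Polyscheme models discussed in this
article: it assumes a flat 2D scene, treats occluders as line segments, and resolves "the
wrench" to the only matching object that both parties can see. All names, coordinates, and
the occlusion test are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    name: str
    kind: str
    pos: tuple  # (x, y) in a shared world frame

def _cross(a, b, c):
    """Signed area test: positive if c lies to the left of the ray a -> b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p, q, a, b):
    """True if segment p-q properly crosses segment a-b."""
    return (_cross(p, q, a) * _cross(p, q, b) < 0
            and _cross(a, b, p) * _cross(a, b, q) < 0)

def visible_from(viewer, items, occluders):
    """Items whose line of sight from the viewer is not blocked by any occluder."""
    return [it for it in items
            if not any(segments_cross(viewer, it.pos, a, b) for a, b in occluders)]

def resolve_referent(kind, robot_pos, human_pos, items, occluders):
    """Return the single item of the given kind that both agents can see, else None."""
    robot_sees = visible_from(robot_pos, items, occluders)
    human_sees = visible_from(human_pos, items, occluders)
    shared = [it for it in robot_sees if it.kind == kind and it in human_sees]
    return shared[0] if len(shared) == 1 else None

# Hypothetical scene: a panel hides wrench_b from the human but not from the robot.
items = [Item("wrench_a", "wrench", (5, 2)), Item("wrench_b", "wrench", (5, -2))]
panel = [((2, -3), (2, 0))]
choice = resolve_referent("wrench", robot_pos=(10, 0), human_pos=(0, 0),
                          items=items, occluders=panel)
print(choice.name if choice else "ambiguous: ask for clarification")
```

Returning None when zero or several candidates are jointly visible is a design choice for
this sketch; a real system would need some strategy, such as asking the human, to handle
that case.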

We developed two models of perspective taking that could handle the above scenario
in a general sense. The first approach used the ACT-R/S system [8] to model perspective
taking using a cognitively plausible spatial representation. The second approach used
Polyscheme [2] and modeled the cognitive process of mental simulation; humans tend to
mentally simulate situations in order to resolve problems.

Using these models, we have demonstrated a robot solving problems similar to the wrench
problem. Videos of a robot and human in this task can be seen at
http://www.nrl.navy.mil/aic/iss/aas/.

It is clear that if humans are to work as peers with robots in shared space, the robot
must be able to understand the natural human tendency to use different frames of reference
and to take the human's perspective. To create robots with these capabilities, we propose
using CCMs, as opposed to more traditional programming paradigms for robots, for three
reasons. First, a natural and intuitive interaction results in reduced cognitive load.
Second, more predictable behavior engenders trust. Finally, more understandable decisions
allow the human to recognize and more quickly repair mistakes.